Runaway Thread Management

The RTX64 watchdog timer can be used to monitor runaway RTSS threads.

A single RTSS thread that becomes CPU bound is usually unintended and the result of a program logic error, which occurs most frequently during the development phase. When a high-priority thread becomes CPU bound on the first RTSS processor (the lowest-numbered RTSS processor), RTX64 tools and Windows applications may not be able to terminate it, because the lower-priority Windows application is unable to communicate with the Subsystem.

The RTX64 Scheduler can be configured to monitor for highly CPU-intensive RTSS threads and notify the RTX64 Subsystem when any thread runs nonstop for longer than the period of time that you specify. When the watchdog timer is enabled, the Subsystem keeps track of how long threads run continuously on each RTSS core without yielding the CPU. If any thread exceeds the timeout period configured by the user, the Subsystem generates a watchdog timeout exception, which freezes all RTSS threads, no matter which thread caused the exception. This includes the RT-TCP/IP Stack.

NOTE: This feature is provided for debugging purposes only and is not recommended for deployment situations.

NOTE: After a watchdog timer exception occurs, the user is responsible for terminating all running RTSS processes.

NOTE: The default watchdog timer period is 5 seconds.

Watchdog Management

When monitoring is enabled, the watchdog timer checks on every timer interrupt for continuous running of the same thread. When the watchdog timer expires, a popup message notifies you of the error, and then Windows resumes normal operation. All RTSS processes that were running at the time of the exception are frozen. You can use the RtssKill utility to unload the frozen processes.
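
The per-tick check described above can be pictured with a small simulation. This is a minimal sketch in C, not the RTX64 implementation: the types and functions (watchdog_t, watchdog_init, watchdog_tick), the fixed core count, and the convention that a negative thread ID means an idle core are all hypothetical. It samples the running thread on each timer tick, restarts the count when the thread changes, and reports a timeout once the same thread has been seen for the configured number of consecutive ticks.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_CORES 4 /* hypothetical core count for this sketch */

/* Per-core bookkeeping. All names are illustrative, not RTX64 APIs. */
typedef struct {
    int      last_thread; /* thread seen at the previous tick, -1 if idle */
    uint64_t run_ticks;   /* consecutive ticks attributed to last_thread */
} core_watch_t;

typedef struct {
    core_watch_t core[MAX_CORES];
    uint64_t     timeout_ticks; /* watchdog period expressed in timer ticks */
    bool         frozen;        /* set once a timeout has been reported */
} watchdog_t;

void watchdog_init(watchdog_t *wd, uint64_t timeout_ticks)
{
    for (int i = 0; i < MAX_CORES; i++) {
        wd->core[i].last_thread = -1;
        wd->core[i].run_ticks = 0;
    }
    wd->timeout_ticks = timeout_ticks;
    wd->frozen = false;
}

/* Call once per timer tick for each core.
 * Returns true only on the tick that triggers the timeout. */
bool watchdog_tick(watchdog_t *wd, int core, int current_thread)
{
    if (wd->frozen) {
        return false; /* everything is already frozen */
    }

    core_watch_t *c = &wd->core[core];

    /* Idle core (assumed convention: negative ID): nothing to monitor. */
    if (current_thread < 0) {
        c->last_thread = -1;
        c->run_ticks = 0;
        return false;
    }

    /* A thread switch restarts the count for this core. */
    if (current_thread != c->last_thread) {
        c->last_thread = current_thread;
        c->run_ticks = 0;
        return false;
    }

    /* Same thread as on the previous tick: extend its continuous run. */
    c->run_ticks++;
    if (c->run_ticks >= wd->timeout_ticks) {
        wd->frozen = true; /* models freezing all RTSS processes */
        return true;
    }
    return false;
}
```

With a timeout of 3 ticks, a thread seen on four consecutive ticks of the same core triggers the timeout on the fourth tick, while a switch to another thread in between restarts the count. Sampling only at tick boundaries is a simplification: a thread that yields and is rescheduled between two ticks would look continuous to this sketch.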

Upon timeout of the watchdog timer, the Subsystem freezes the offending process, regardless of the current exception handling settings. All other running RTSS processes are frozen as well. Once the Subsystem enters this state, you must use RtssKill to clear the frozen processes and return the Subsystem to a working state.

To do this:

Use RtssKill to get the IDs of the frozen processes.
Run rtsskill <id> for each discovered process to terminate it.

For more information, see RtssKill Usage.

Once the frozen processes have been cleared, the Subsystem will continue running without restarting.

Notes

The watchdog timer currently monitors only for individual threads that become CPU bound. It does not monitor whether a core remains saturated with RTX64 activity for the timeout period.
The watchdog timer resets when a thread switch occurs on a core.
The watchdog timer will not take effect when:

It is set to a value greater than the TCP/IP Stack timer interval while the TCP/IP Stack is started.
A real-time application sets a timer with an expiration period that is less than the value set for the watchdog timer period.
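
The two conditions above can be written as a simple configuration check. This is a minimal sketch in C that only restates those conditions: the structure watchdog_config_t, its fields, the function watchdog_can_take_effect, and the choice of microseconds as the unit are all hypothetical and are not part of any RTX64 API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the relevant settings, in microseconds. */
typedef struct {
    uint64_t watchdog_period_us;      /* configured watchdog timer period */
    bool     tcpip_started;           /* is the TCP/IP Stack started? */
    uint64_t tcpip_timer_interval_us; /* TCP/IP Stack timer interval */
    bool     has_app_timer;           /* does an application timer exist? */
    uint64_t min_app_timer_period_us; /* shortest application timer period */
} watchdog_config_t;

/* Returns false when either documented condition applies, meaning the
 * watchdog timer would not take effect with these settings. */
bool watchdog_can_take_effect(const watchdog_config_t *cfg)
{
    /* Condition 1: watchdog period exceeds the TCP/IP Stack timer
     * interval while the stack is started. */
    if (cfg->tcpip_started &&
        cfg->watchdog_period_us > cfg->tcpip_timer_interval_us) {
        return false;
    }

    /* Condition 2: an application timer expires more often than the
     * watchdog period. */
    if (cfg->has_app_timer &&
        cfg->min_app_timer_period_us < cfg->watchdog_period_us) {
        return false;
    }

    return true;
}
```

For example, with the default 5-second watchdog period (5,000,000 microseconds) and an application timer that expires every 1,000,000 microseconds, the function returns false, so the watchdog would not be expected to fire.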

When a process being debugged is about to starve, a breakpoint exception occurs that causes the Visual Studio debugger to break automatically. This allows you to further inspect process memory using either the Visual Studio IDE or WinDbg extensions.